ABSTRACT 

Various models of educational evaluation are 
presented. These include: (1) the classical type model, which 
contains the following guidelines: formulate objectives, classify 
objectives, define objectives in behavioral terms, suggest situations 
in which achievement of objectives will be shown, develop or select 
appraisal techniques, and gather and interpret performance data; (2) 
the accreditation model, which emphasizes the process of education, 
rather than its outcomes; (3) the systems model, inherent in which is 
the idea of evaluation as a management feedback system throughout the 
course of the program; and (4) the discrepancy model, which combines 
the best available methods for using evaluation as a program 
development tool. (CK) 

by Lawrence McCluskey 

Evaluation became an educational issue 
when the pattern of governmental funding 
of education underwent a dramatic change 
in the years following the 1957 launching of 
Russia's Sputnik satellite. The National 
Defense Education Act (1958) and the 
Elementary and Secondary Education Act 
(1965) caused federal support of education 
alone to almost double. But most of the 
programs funded under these new 
auspices were geared toward innovation 
and change, rather than toward expanding or 
enhancing educational techniques that 
were previously in existence. As a result of 
this emphasis on "innovative" programs, 
governmental agencies began demanding 
some kind of monitoring process by 
which their effectiveness could be gauged. 
The old axiom, "Why throw good money 
after bad?" gained new currency, and 
"evaluation" became a new career route 
for many educational researchers. 

However, in education, as well as in 
allied fields, there is an important 
difference between "evaluative" research 
and what is often called "pure" research. 
The difference is simply this: Where 
"pure" research asks the question, "Is 
treatment 'a' a suitable remedy for 
deficiency 'b'?" evaluative research 
assumes that the answer to this question is 
yes and proceeds to examine the impact 
of the treatment on the indicated 
deficiency. In other words, "pure" 
research begins with hypotheses, while 
"evaluative" research begins with 
assumptions. Or, stating the concept in a 
different way, the "pure" researcher 
cannot be "wrong"; his hypothesis is 
either accepted or rejected according to 
some predetermined standard. On the 
other hand, the "evaluative" researcher is 
faced with determining whether or not 
some assumption is "right or wrong." At 
least, this was the state of affairs in the 
earlier models used in evaluating 
educational programs. 

The Classical Type Model 

One of the first models used in 
evaluation studies, which is in fact 
sometimes referred to as the Classical 
Type of Evaluation, contains the following 
guidelines: 

"1. Formulate objectives. Determine 
broad goals of the program. 

2. Classify objectives. Develop a 
typology of objectives so an 
economy of thought and action may 
be achieved. 

3. Define objectives in behavioral 
terms . . . 

4. Suggest situations in which achievement 
of objectives will be shown. 

5. Develop or select appraisal techniques 
(standardized tests, ad hoc 
tests, questionnaires, etc.) 

6. Gather and interpret performance 
data. The final step in the evaluation 
process involves the measurement 
of student performance data with 
behaviorally stated objectives..." 1 

Even a cursory examination of this 
model reveals what has now been 
recognized as one of its weaknesses — its 
emphasis on examining program products 
or outcomes. In effect, it not only assumes 
the efficacy of some activity or treatment, 
but also assumes the presence of the 
activity in an effort to produce the 
objectives formulated by the designers of 
the program. But experience has shown 
that the chief impediment in the 
implementation of a new education 
program is often the failure to properly 
apply the treatments specified in the 
program design. 

Furthermore, this model assumes that 
all educational objectives can be 
measured by objective, quantifiable 
methods. Now this assumption might be 
supported (given some operational 
definition of attainment) if educational 
objectives were confined to those areas in 
which standardized testing has been 
established, but it encounters serious 
objections when one considers such 
program objectives as "improved student 
citizenship" or "enhancing appreciation of 
individual worth." Obviously, these points 
do not imply that verifiable performance 
outcomes are not the concern of the 
evaluators. In fact, verification that the 
program has attained its objectives is the 
focal point of evaluative research. 
However, if the evaluator concentrates 
solely on the achievement of behaviorally 
defined performance objectives as this 
model dictates, he may overlook those 
aspects of the program which greatly 
influence its success or failure. 

The Accreditation Model 

Another method of program evaluation 
is what may be called the Accreditation 
Model. In this model, emphasis is on the 
process of education, rather than on the 
outcomes. The assumption made in this 
model is that improvement in the 
educational process would result in 
improvement in producing desirable 
outcomes. Criteria were developed for 
rating various components of the 
education process such as building 
facilities, library size and services, 
instruction equipment, teacher qualifi- 
cations, guidance programs, etc. Once 
these criteria were established, a team of 
experts would visit the program site and 
rate the program on various criteria. The 
ratings of these experts could then be used 
to compare one program to another, or to 
a set of standards laid down by other 
experts in the field of education. 

Opinion is divided over the usefulness of 
the accreditation model as an evaluation 
technique. Its advantages appear to be 
that it can quickly respond to the need for 
program evaluation and makes use of the 
abilities of people who are "experts" in 
their fields. These advantages, however, 
may be offset by some weaknesses which 
are inherent in the model. 

Some observers have commented that 
although the accreditation model "has the 
advantages of quick response and the 
utilization of the full range of the 
evaluator's competence, it obviously 
leaves much to be desired in terms of 
objectivity and validity, which are at best 
moot." 2 The "genetic defects" in the 
Accreditation Model are "that its 
practitioners do not seek to justify 
empirically the standards used to judge 
worth and that attention to the processes of 
education is not balanced by attention to 
its consequences on learners." 3 

The Systems Model 

Still another type of evaluation utilizes a 
Systems Approach. Inherent in this model 
is the notion of evaluation as a 
management feedback system throughout 
the course of the program. Evaluation in 
this context becomes a monitoring process 
which concentrates on gathering data 
about programs and providing management 
with the information necessary to make 
modifications and improvements during the 
course of the program. A summary of the 
way the systems model operates is 
included here. 4 

A specific example of an evaluation 
design based on the Systems Approach is 
one developed by Stufflebeam called the 
CIPP model. 5 Each letter stands for a 
discrete step in evaluation: Context, Input, 
Process, Product; together, they can be 
applied to nearly any educational 
evaluation study. Each part, as Suchman 
suggested, has its own objectives. 

The major objective of context 
evaluation is to define the environment 
where change is to occur, the 
environment's unmet needs, problems 
underlying those needs, and opportunities. 
Information from context evaluation is 
ultimately used to establish program goals 
and objectives. 

The purpose of an input evaluation is to 
determine how to utilize resources to meet 
the program goals and objectives. The end 
product of such an evaluation is an 
analysis of alternative procedural designs 
in cost/benefit terms, from which the 
decision maker can select. Decisions 
based upon input evaluation usually result 
in the specification of procedures, 
schedule, staff requirements, and budget. 
According to Stufflebeam, by evaluating 
the input, it can be decided whether other 
types of inputs are needed to achieve the 
objectives. 

Once a designed course of action has 
been approved and implementation of the 
design has begun, process evaluation is 
needed to provide periodic feedback to 
project managers and others responsible 
for continuous control and refinement of 
plans and procedures. The objective of 
process evaluation is to detect or predict, 
during the implementation stages, defects 
in the procedural design or its 
implementation. 

Finally, product evaluation is used to 
determine the effectiveness of the project 
after it has run full cycle. Its objective is to 
relate outcomes to objectives, and to 
context and input, i.e., to measure and 
interpret outcomes. 

The CIPP model, while offering a helpful 
theoretical frame of reference for the 
assessment of change, has been found to be 
deficient as a guide for actual practice. 
The complexity of its analysis of 
evaluation into many decision-making 
situations has made it unmanageable 
except in theory. "In short, while the 
proposed structure (CIPP) provides a 
general guide for developing evaluation 
designs, educators must still engage 
heavily in the laborious, painstaking 
process of developing each design de 
novo." 6 

The Discrepancy Model 

The dearth of readily applicable theory 
and the virtual absence of reported 
successful evaluation practice with a 
systems orientation were fully recognized 
by the evaluation team headed by 
Malcolm Provus when they set out to 
construct a new model. "Our mandate was 
clear: Redefine the purpose of evaluation 
. . . and then devise and test an operational 
evaluation model based on sound theory 
. . ." (p. 2). The Discrepancy Evaluation 
Model (1969), known as the Provus Report, 
seems to combine the best available 
methods for using evaluation as a program 
development tool as well as a means of 
program assessment, in a readily 
adaptable, workable model. 

In developing the Discrepancy Model, 
Provus proceeded on the following basic 
assumptions (among others): 

"1. Many educational programs ... are 
installed in public school systems 
without adequate planning. 

2. Given this fact, evaluation should be 
a process for program development 
and stabilization, as well as a means 
of assessment. To accomplish this 
purpose, evaluation must provide 
information which decision makers 
can use to improve, stabilize and 
assess programs." (p. 8-9). 

Provus sees evaluation, at its simplest 
level, as the comparison of performance 
against a standard. Like Stufflebeam, he 
divides the evaluation process into stages. 
In each stage, some indicator of 
performance is obtained and 
compared to a standard which serves as 
the criterion of performance. The 
relationships among the different evaluation 
stages, and between performance and 
standard at each stage, are illustrated 
schematically in Figure I. 

An educational program is viewed as a 
dynamic input-output system with 
specifications for inputs, process and 
output being necessary and sufficient for 
program design. The relationship among 
these components may be represented by 
the following equation: 

I(P) = O 

where "I" = input, "P" = process, and "O" = 
output. "Outputs" are viewed as a function 
of the interaction of inputs with process. 
For example, students, teachers and 
materials (inputs) interact in such a 
manner (process) as to produce a change 
in reading levels (output). The difference 
between the "goal" of the program and the 
"output" of the program should be 
minimized for program success (p. 4). 

In Figure I, the "standard" at stage I is 
the Design Criteria, a comprehensive list 
of program elements that make up the 
three basic "systems" categories of input, 
process and output. The Design Criteria (a 
sample of which appears on page 17 of the 
Provus Report) constitute a basic 
assumption on which all other criteria for 
standards used throughout the evaluation 
are based. Provus believes it is vital to the 
smooth operation and ultimate success of 
any program that these Design Criteria be 
formed with information provided by 
program staff, and preferably in a "design 
meeting" attended by those people. 

When the Design Criteria have been 
agreed upon, then, a description of the 
program's design is obtained as 
"performance" information. Stage I 
evaluation takes place when the program 
design is compared with the Design 
Criteria. Discrepancy between "performance" 
and "standard" is reported to 
those responsible for management of the 
program. To eliminate the discrepancy 
and approach congruence between the 
two, adjustments may be made in either 
one. 

Once the program design has been 
established, the program is ready to be 
implemented, and the "standard" against 
which the "performance" of initial 
implementation is measured becomes that 
program design (Stage II). Once again, 
discrepancy information provided by the 
evaluator may be used by the program 
manager to redefine the program or 
change installation procedures. 

Not until Stage III is any cause and 
effect comparison made. At Stage III, the 
"standard" is that part of the program 
design which describes the relationship 
between program processes and interim 
products. Discrepancy information is used 
either to redefine process and relationship 
of process to interim product or to better 
control the process being used in the field. 

At Stage IV the "standard" is that part 
of the program design which refers to 
terminal objectives. Program "perfor- 
mance" information consists of criterion 
measures used to estimate the terminal 
effects of the project. 

Finally, at Stage V, a cost-benefit 
analysis may be done to determine 
program efficiency, using the cost of other 
programs with the same product as a 
standard. 7 
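
The stage-by-stage comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea of reporting the gap between "standard" and "performance" at each stage; it is not taken from the Provus Report. The class name, function names, and sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str          # stage label, e.g. "I"
    standard: set      # elements the standard calls for
    performance: set   # elements actually observed


def discrepancy(stage):
    """Return the standard elements missing from performance, sorted."""
    return sorted(stage.standard - stage.performance)


def report(stages):
    """Map each stage name to its list of discrepancies."""
    return {s.name: discrepancy(s) for s in stages}


# Hypothetical data, invented for illustration only.
stages = [
    Stage("I", {"inputs", "process", "outputs"}, {"inputs", "process"}),
    Stage("II", {"staff trained", "materials delivered"},
          {"staff trained", "materials delivered"}),
]

for name, gaps in report(stages).items():
    print(f"Stage {name}: {gaps if gaps else 'no discrepancy'}")
```

In this sketch a non-empty list is what the evaluator would report to program management, who may then adjust either the standard or the performance.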

Defining Objectives 

At this point, it is clear that if evaluation 
is to be a meaningful endeavor it must 
specify the degree to which a program has 
achieved its objectives. But this 
consideration raises a further question 
that is frequently encountered in the field 
situation. Many programs, especially 
those funded by state and federal 
agencies, tend to be composed of multiple 
objectives, some of which are clearly 
specified in the program outline, while 
others are only hinted at. Examination of 
many program proposals reveals that the 
designers, in an effort to get the maximum 
advantage out of the program, have failed 
to delineate between or among various 
types of objectives. As a result, evaluation 
of such programs becomes an impossible 
task, and both planners and evaluators 
find themselves frustrated in their labors. 

It appears then that great care should be 
taken in specifying program objectives 
according to some coherent plan. One such 
plan was developed by Operation PEP in 
California a few years ago. According to 
this paradigm, objectives are divided into 
four stages; policy objectives, program 
objectives, curriculum objectives and 
instructional objectives. Each of these 
phases in the chain of objectives is 
assigned a person or group who would be 
accountable for the objectives at that 
particular level. 

Operation PEP objectives are defined as 
follows: 

1. Policy objectives define the performance 
commitments of an organization; 
they define ends that must be 
achieved to fulfill external (societal) 
and internal (organizational) requirements. 
Objectives at this level 
would normally be associated with 
the policymaking body, the Board of 
Education. 

2. Program objectives are derived from 
policy objectives; they define a plan 
of action for the achievement of the 
internal and external purposes stated 
in the policy objectives. Program 
objectives would be the responsibility 
of the program director or coordinator, 
or possibly assistant or district 
superintendent. 

3. Curriculum objectives would define 
the performance outcomes required 
to fulfill the program requirements. 
Administrators would be accountable 
for objectives at this level. 

4. Instructional objectives refer to 
performance in the actual teaching-learning 
process. They refer to 
individual and instructional staff 
performance products, and are 
usually in the hands of teachers. 8 

Examination of this model shows that 
the more generalized the specified 
objective, the higher in the organizational 
chain responsibility for its attainment lies. 
Conversely, those objectives which are 
most specifically stated are the 
responsibility of the people who are in 
most immediate contact with the 
population impacted by the program. Such 
an organization allows an evaluator not 
only to recognize a program dysfunction, 
but also to trace the dysfunction to its 
source, and to feed back necessary 
information to program directors while 
there is still time to correct the 
dysfunction. Stating this same proposition 
in another way, whereas the classical 
model of evaluation would only permit an 
evaluator to state that a specific objective 
had not been achieved, this model would 
allow him to point out the reason that it 
was not achieved. For example, suppose 
that a board of education decided that 
teaching machines would enhance the 
level of student achievement (Policy 
level) and decided to purchase such 
machines for use in an individual school 
(Program level). The local school 
administrator (Curriculum level) could 
scarcely be held accountable for 
attainment of the final objective, enhanced 
student achievement (Instructional level), 
if the machines were never delivered. In 
addition, evaluators working with such a 
model would be able to provide 
information about this specific dysfunction 
to the policy level body during the course 
of the program so that the discrepancy 
between planned outcome and actual 
conditions could be reasonably reduced. 
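
As a rough illustration of how such a chain of objectives lets an evaluator trace a dysfunction to its source, the sketch below walks the four levels from most general to most specific and stops at the first one whose objective was not met. The level names and accountable parties follow the list above; the function name and the status values for the teaching machine example are hypothetical.

```python
# Levels ordered from most general to most specific, each paired with
# the party the text names as accountable at that level.
LEVELS = [
    ("Policy", "Board of Education"),
    ("Program", "program director or coordinator"),
    ("Curriculum", "administrators"),
    ("Instructional", "teachers"),
]


def trace_dysfunction(met):
    """Return (level, accountable party) for the first unmet level, else None.

    `met` maps a level name to True if its objective was achieved.
    A level missing from `met` is treated as not achieved.
    """
    for level, accountable in LEVELS:
        if not met.get(level, False):
            return level, accountable
    return None


# Hypothetical status for the teaching machine example: the policy was
# set, but the machines were never delivered, so every later level fails.
status = {
    "Policy": True,
    "Program": False,
    "Curriculum": False,
    "Instructional": False,
}

print(trace_dysfunction(status))
```

Here the failure is traced to the Program level rather than being charged to the administrator or teacher further down the chain.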

One More Evaluation Paradigm 

In an attempt to draw upon the research 
that has been discussed here and to 
develop an evaluation model adaptable to 
the widest variety of program designs, a 
number of IAR staff members reviewed 
many program designs and proposals and 
attempted to apply various evaluation 
models to these designs. A second step in 
this process was to compare the original 
program designs with final evaluation 
reports. This was done in an effort to 
determine the degree to which the 
programs had achieved their stated 
objectives as well as the degree to which 
evaluators might have used their findings 
to assist those responsible for implementation 
of the program. As a result of this 
survey of documents, the staff attempted 
to develop a new evaluation model. (See 
Figure II). 



Obviously this model derives many of its 
features from models that have been 
previously discussed, but it also contains 
two other significant features: Resource 
Allocation and Conversion. While Resource 
Allocation is a self-explanatory 
term, the notion of Conversion merits 
elaboration here. 

Conversion, a notion found to be critical 
in the program planning evaluation 
interface, deals with what occurs after 
various program inputs (personnel, 
supplies, facilities, etc.) have been 
allocated and before the process begins. 
Let me be more specific. If we think about 
any innovative program that is to be 
introduced into a system, we soon realize 
that it is illusory to imagine that the staff, 
who are unfamiliar with the new 
techniques and facilities, not to mention 
the students, will immediately begin to 
function at their optimum levels. Rather, 
experience, as well as common sense, 
dictates that there should be a time of 
training and adjustment built into the time 
sequencing of the program, and that 
information on the progress of the 
program during this time be supplied to 
the program planners. The necessity of 
this process is recognized in the model in 
the step called Conversion. 

Perhaps the use of this evaluation model 
might be clarified by actually applying it 
to a single program feature. To go back to 
the example cited previously, let us 
suppose that a district wished to measure 
the effectiveness of using teaching 
machines to improve student achievement. 
The completed model might 
resemble Figure III. 



Program Assumption: 
(Develops from needs assessment/ 
treatment identification) 

Policy Statement: 
(Generalization) 

Program Goal: 
(Specific performance desired) 

Curriculum Objective: 
(Selection of one or more alternative 
means/methods/sequences to achieve 
desired performance) 

Resources Necessary and Allocated: 
(Acquisition of staff, facilities, 
materials, etc.) 

Conversion: 
(Training, orientation, scheduling) 

Processes: 
(Instruction, performance) 

How Performance is to be Assessed: 
(Impact analysis) 

FIGURE II 






Program Assumption: 
Highly structured, self-pacing instruction 
that includes immediate feedback 
will improve student learning. 

Policy Statement: 
Teaching machines will be used as 
part of the instructional process in 
secondary school mathematics. 

Program Goal: 
To use teaching machines in instruction 
of general mathematics to eighth 
grade boys in school X beginning with 
the spring semester. 

Curriculum Objective: 
To significantly improve achievement 
in math in two classes of eighth grade 
students by using teaching machines 
in the classroom for a period of one 
hour per day. 

Resources Allocated and Necessary: 
Teaching machines, programmed text, 
teacher, classroom, students. 

Conversion: 
Training of staff by central office math 
supervisor and familiarization of 
students by local teachers with the 
techniques necessary to properly use 
teaching machines. 

Processes: 
Daily use of teaching machines for a 
one hour period in two eighth grade math 
classes. 

Performance Assessment: 
Statement of testing strategy and 
sequence. Analysis of test results. 

FIGURE III 



This IAR model represents an attempt to 
achieve a number of ends. First, it tries to 
ensure congruency of thought between 
planner and evaluator from generalized 
policy statement to specific end desired. In 
practice, participation in the program 
design should involve an array of staff 
members and district officers. The 
program structure should help to make 
clear the roles of the various participants 
in implementation. Secondly, it provides a 
means by which the evaluator can monitor 
the program throughout its duration and 
provide feedback to those responsible for 
program implementation. Furthermore, it 
also allows causes of program dysfunction 
to be quickly isolated and discrepancies 
between specification and performance to 
be reduced. Finally, it makes explicit the 
means by which achievement of program 
objectives can be measured. 

In Measurement of School Quality and its Determiners, Vincent 
and Olson have written an in-depth "History" of the work of the 
Institute of Administrative Research over the past ten years. 

In addition to a general discussion of school appraisal, they 
focus on one specific instrument, Indicators of Quality, the 
classroom observation schedule developed by the Institute of 
Administrative Research under the direction of William S. Vincent. 

By offering a systematic approach to one of the central concerns 
of educational leaders, Measurement of School Quality and its 
Determiners provides the school administrator with stimulating and 
helpful resource material. 

$6 per single copy 